This paper discusses the goals of loop optimization and several program transformation methods that can significantly reduce the access time to subscripted variables, eliminate some types of dependence, increase the "depth" of software pipelining, and merge loop iterations so that code compaction becomes easier.
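As an illustration of reducing accesses to subscripted variables, the sketch below applies scalar replacement to a simple recurrence. It is not taken from the paper: the function names and the choice of a prefix-sum loop are assumptions made for this example only.

```python
def prefix_sum_indexed(b):
    """Baseline: every iteration reads a[i - 1] back from the array."""
    a = [0] * len(b)
    for i in range(len(b)):
        prev = a[i - 1] if i > 0 else 0
        a[i] = prev + b[i]
    return a


def prefix_sum_scalar(b):
    """Scalar replacement: the value of a[i - 1] is carried in `carry`,
    so the loop body no longer reads the subscripted variable a[i - 1]."""
    a = [0] * len(b)
    carry = 0
    for i in range(len(b)):
        carry = carry + b[i]
        a[i] = carry
    return a


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(prefix_sum_indexed(data))
    print(prefix_sum_scalar(data))
```

Both versions compute the same result. In a compiled language the scalar would typically live in a register, which is where the saving in access time would come from.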
We analysed traditional automatic parallelization technology, including dependence analysis theory, program transformation techniques, parallel schemes, and the optimization of the related synchronization and communication, which together form the theoretical basis of this work. We also analysed in depth the characteristics of CFD computation, especially those of explicit difference computation. Finally, we summarized the drawbacks of traditional automatic parallelization when applied to CFD: small parallel granularity, difficulty in obtaining a globally consistent data partition, and difficulty in achieving high parallel efficiency on distributed-memory systems.
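To make the link between explicit difference computation and data partitioning concrete, here is a minimal sketch, assuming a one-dimensional three-point explicit update with fixed end points. The scheme, the function names, and the block partition are illustrative assumptions, not the method of the paper. Each block needs only one neighbouring cell per side (a halo cell), which is what a distributed-memory implementation would exchange.

```python
def step_global(u, r):
    """One explicit step over the whole array; the two end points stay fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + r * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new


def step_partitioned(u, r, parts):
    """The same step computed block by block.

    Each block works on a local copy extended by one halo cell per side,
    standing in for the data a neighbouring process would send.
    """
    n = len(u)
    bounds = [n * p // parts for p in range(parts + 1)]
    new = u[:]
    for p in range(parts):
        lo, hi = bounds[p], bounds[p + 1]
        h_lo, h_hi = max(lo - 1, 0), min(hi + 1, n)
        local = u[h_lo:h_hi]  # block plus halo cells
        for i in range(max(lo, 1), min(hi, n - 1)):
            j = i - h_lo  # index of cell i inside the local copy
            new[i] = local[j] + r * (local[j - 1] - 2 * local[j] + local[j + 1])
    return new


if __name__ == "__main__":
    u0 = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0]
    print(step_partitioned(u0, 0.25, 3) == step_global(u0, 0.25))
```

Because every block reads only old values, the blocks are independent within a step, and the partitioned result matches the global one.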
It then presents common program transformations that improve the temporal and spatial locality of data in order to reduce cache conflicts and the cache miss rate, and summarizes three memory access patterns typical of multimedia applications, laying the groundwork for further study of stream cache prefetching for the DPC memory system. The paper also introduces a data allocation approach for scratch-pad SRAM, with the aim of improving the cache hit rate.
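One widely used locality-improving transformation is loop tiling. The sketch below applies it to a matrix transpose. It is an assumed example rather than a transformation named in the paper, and `tile` is a hypothetical parameter. Python will not show the cache effect itself; the code only illustrates how the iteration order changes while the result stays the same.

```python
def transpose_naive(a):
    """Row-by-row traversal: writes to `out` jump to a different row each time."""
    n, m = len(a), len(a[0])
    out = [[0] * n for _ in range(m)]
    for i in range(n):
        for j in range(m):
            out[j][i] = a[i][j]
    return out


def transpose_tiled(a, tile=2):
    """Tiled traversal: the work is split into tile x tile blocks so that
    reads and writes stay within a small region before moving on."""
    n, m = len(a), len(a[0])
    out = [[0] * n for _ in range(m)]
    for ii in range(0, n, tile):
        for jj in range(0, m, tile):
            for i in range(ii, min(ii + tile, n)):
                for j in range(jj, min(jj + tile, m)):
                    out[j][i] = a[i][j]
    return out


if __name__ == "__main__":
    matrix = [[1, 2, 3], [4, 5, 6]]
    print(transpose_tiled(matrix) == transpose_naive(matrix))
```

In a compiled setting the tile size would normally be chosen so that one source tile and one destination tile fit in the cache together.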